Electroencephalogram (EEG) signals are low-amplitude electrical signals, recorded by electrodes on the scalp,
that reflect the electrical activity of neurons in the brain. EEG signals have been used successfully in many
different areas to detect neurological diseases, especially epilepsy. This study aims to compare different
spectral analysis methods on EEG data. For this purpose, three different feature vectors were created by
calculating power spectral densities between 1-49 Hz using three different spectral analysis methods:
Periodogram, Welch, and Multitaper. The performance of the three spectral analysis methods was compared
by classifying the created feature vectors with the Support Vector Machine (SVM) algorithm. The accuracy
of the Periodogram and SVM model was 92.30%, the accuracy of the Welch and SVM model was 96.16%, and
the accuracy of the Multitaper and SVM model was 94.48%. The model with the highest performance is the
classification model that combines the Welch method and the SVM algorithm.


Keywords: Spectral analysis, EEG, Welch, Periodogram, Multitaper 


© 2022 Published by Alntelia 


1. Introduction 


The electrical activity between brain neurons is linked to motor, cognitive, perceptual, and emotional processes.
EEG records this electrical activity from the scalp. Recording the electrical activity of the brain serves as an
auxiliary test in the diagnosis of many neurological problems, especially epilepsy. Although it has not yet been
fully decoded, a large amount of information is known to be contained in these signals obtained from the human
brain [1].


EEG is a completely painless and harmless examination method. In addition, EEG is preferred due to its lower cost
and lower equipment requirements compared to other neuroimaging techniques such as magnetic resonance imaging
(MRI), functional magnetic resonance imaging (fMRI), computed tomography (CT), and functional near-infrared
spectroscopy (fNIRS) [2]. Because of its portability, noninvasive nature, and relatively simple and inexpensive
equipment, EEG is the most widely used technique in the clinic. Therefore, findings obtained using EEG signals can
be easily transferred to daily clinical use [3]. Clinical analysis of EEG signals aids in disease management and
prognosis. Thanks to the latest developments in biomedical signal processing, multi-resolution analysis of EEG
signals is possible in the diagnosis of diseases [4]. A number of signal processing methods are applied to analyze
EEG signals by converting them from the time domain to the frequency domain.


Spectral analysis is a method of examining the characteristics of EEG signals as a function of frequency and
estimating how the power of a signal is distributed over frequency. Its aim is to reveal the repetitive and hidden
behaviors of the signal. The distribution and characteristics of EEG signals in the frequency domain can be found
with the power spectral density [5]. Spectral analysis of EEG can be performed over a wide frequency range, as well
as in defined sub-bands of the EEG signal. EEG signals are basically divided into five frequency bands: delta (δ),
theta (θ), alpha (α), beta (β) and gamma (γ) [6]. Power spectrum estimation approaches have long been used in the
study of EEG signals. Many studies in the literature apply spectral analysis methods to EEG data, including
epileptic seizure detection [7-9], sleep disorder detection [10], biometric verification [11], classification of
mental tasks [12] and detection of major depressive disorder [13]. The Periodogram [11], Welch [8, 10] and
Multitaper [9] methods are widely used to estimate the power spectral densities of EEG signals.


This study aims to compare different spectral analysis methods on an experimental EEG dataset. For this purpose,
three different feature vectors were created by calculating power spectral densities between 1-49 Hz using three
different spectral analysis methods: Periodogram, Welch and Multitaper. The performance of the three spectral
analysis methods was compared using the created feature vectors and the SVM algorithm.


2. Materials and Methods 


In the study, three different spectral analysis methods, namely Periodogram, Welch and Multitaper, were compared
on an EEG dataset. The general block diagram of the study is given in Figure 1.


Periodogram --> PSD Estimation --+
Welch       --> PSD Estimation --+--> Classification
Multitaper  --> PSD Estimation --+


Figure 1. The general block diagram of the study 
A. Dataset: 


The experimental EEG dataset used in this study was obtained from physionet.org, a large data repository for data
scientists and machine learning practitioners [14]. EEG signals were recorded from 36 subjects (27 females,
9 males) before and during mental arithmetic tasks by a neurophysiologist specialized in EEG visual examination.
The Bioethics Commission of the Education and Scientific Center of the Taras Shevchenko National University of
Kyiv "Institute of Biology and Medicine" approved the EEG dataset used. Each participant gave written informed
consent in accordance with the World Medical Association (WMA) Declaration of Helsinki. The inclusion criteria for
participants were normal or corrected-to-normal visual acuity, normal color vision, and the absence of clinical
signs of mental or cognitive impairment and of verbal or nonverbal learning difficulties. Exclusion criteria were
psychoactive drug use, alcohol or drug addiction, and the presence of neurological or psychiatric complaints [15].


B. Experiment design: 


EEG signals were recorded using the Neurocom EEG 23-channel system (21 channels and 2 reference ear electrodes).
During the experiments, participants were given the arithmetic task of serially subtracting two numbers. Each trial
began with the verbal subtraction of a 2-digit number from a 4-digit number (example: 3141 - 42). All recordings are
60-second artifact-free EEG segments. EEG recordings of each participant were taken before and during the mental
arithmetic task. Figure 2 shows how all electrodes were placed on the scalp using the international 10-20 system.


Figure 2. Electrode positioning for the international 10-20 system


C. Training and Test Datasets 


In the study, the Holdout method was used to divide the dataset into training and test datasets. In the holdout
method, a certain share of the samples is set aside as the test dataset and the remaining data are used as the
training dataset. There are 4536 samples in total in the dataset. In order to assess the performance of the
classification models, these data were divided by the holdout method into 2/3 as the training dataset (3024) and
1/3 as the test dataset (1512). The model is trained using the training dataset and the performance of the model is
evaluated using the test dataset.


The Holdout method is widely used when each class is represented by a sufficient number of samples and the dataset
is evenly distributed [16]. In the study, the samples belonging to each class are sufficient in number and evenly
distributed.
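
The 2/3 to 1/3 split described above can be sketched as follows. This is a minimal illustration, not the study's code: the per-class (stratified) shuffling, the fixed seed, and the class labels "rest" and "task" are assumptions introduced here, since the text only states the split sizes and that the classes are evenly distributed.

```python
import random


def stratified_holdout(labels, test_fraction=1 / 3, seed=0):
    """Split sample indices into train/test, keeping class proportions.

    Stratification and the seed are assumptions of this sketch.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 4536 samples, evenly split between two classes.
labels = ["rest"] * 2268 + ["task"] * 2268
train_idx, test_idx = stratified_holdout(labels)
```

With 4536 evenly distributed samples, this yields 3024 training and 1512 test indices, matching the sizes reported in the text.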


D. Spectral Analysis: 


Spectral analysis methods show how a stationary, random and finite-length signal is distributed over the frequency
band. The aim of spectral analysis is to reveal the repetitive and hidden behaviors of the signal [5]. Power spectral
density describes the power distribution of a signal over the frequency range. Periodogram, Welch and Multitaper
spectral analysis methods are widely used to estimate the power spectral density.


The Periodogram method is the simplest of the spectral analysis methods. It is applied directly to the EEG signals
and is the most basic way of extracting the power spectrum from a signal. It is calculated as in Equation 1, where
$\hat{\phi}(\omega)$ represents the power spectral density, $N$ is the number of samples in the signal, $y(t)$ is
the signal, and $\omega$ is the frequency at which the power spectral density is evaluated.


$$\hat{\phi}(\omega) = \frac{1}{N}\left|\sum_{t=1}^{N} y(t)\, e^{-i\omega t}\right|^{2} \qquad (1)$$
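
As a rough illustration of Equation 1, the sketch below evaluates the squared magnitude of the discrete Fourier sum divided by the number of samples, then keeps the values between 1 and 49 Hz as a feature vector. It is a dependency-free sketch under stated assumptions: the function names, the synthetic 10 Hz test signal, and the 100 Hz sampling rate are hypothetical, and the time index starts at 0 rather than 1, which does not change the magnitude.

```python
import cmath
import math


def periodogram(y, fs):
    """Return (freqs, psd) with psd[k] = |sum_t y[t] e^{-i w_k t}|^2 / N."""
    n = len(y)
    freqs, psd = [], []
    for k in range(n // 2 + 1):  # non-negative frequencies only
        w = 2.0 * math.pi * k / n  # angular frequency in rad/sample
        dft = sum(y[t] * cmath.exp(-1j * w * t) for t in range(n))
        freqs.append(k * fs / n)  # bin frequency in Hz
        psd.append(abs(dft) ** 2 / n)
    return freqs, psd


def band_features(freqs, psd, lo=1.0, hi=49.0):
    """Keep the PSD values whose frequency lies in [lo, hi] Hz."""
    return [p for f, p in zip(freqs, psd) if lo <= f <= hi]


# Synthetic 1-second signal: a 10 Hz sine sampled at 100 Hz (hypothetical).
fs = 100
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
freqs, psd = periodogram(signal, fs)
peak_hz = freqs[psd.index(max(psd))]
features = band_features(freqs, psd)
```

The direct sum is quadratic in the signal length; for real recordings an FFT-based routine would normally be used instead.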
The Welch method averages the periodograms of overlapping windowed segments of the signal. In the method, the
signal is windowed to create overlapping segments. Then the squared magnitude of the Discrete Fourier Transform
(DFT) is calculated for each segment. Finally, averaging the PSD over the separated segments gives the Welch
periodogram [17]. The Welch method is calculated as in Equation 2.


$$\hat{\phi}_{W}(\omega) = \frac{1}{P}\sum_{p=1}^{P} \hat{\phi}_{p}(\omega) \qquad (2)$$


In Equation 2, $P$ is the number of windowed segments, $\hat{\phi}_{p}(\omega)$ is the Periodogram calculated for
the $p$-th windowed segment, and $\hat{\phi}_{W}(\omega)$ is the average of the periodograms.
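
The segment-and-average procedure of Equation 2 can be sketched as below. This is an illustrative implementation only: the Hann window, the normalisation by window power, and the segment length and overlap used in the example are assumptions of the sketch and are not the settings of the study.

```python
import cmath
import math


def hann(m):
    """Periodic Hann window of length m (window choice is an assumption)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / m) for i in range(m)]


def segment_periodogram(seg, window):
    """Windowed periodogram of one segment, normalised by the window power."""
    m = len(seg)
    power = sum(w * w for w in window)
    out = []
    for k in range(m // 2 + 1):
        dft = sum(window[t] * seg[t] * cmath.exp(-2j * math.pi * k * t / m)
                  for t in range(m))
        out.append(abs(dft) ** 2 / power)
    return out


def welch(y, fs, nperseg, noverlap):
    """Average the periodograms of overlapping segments."""
    step = nperseg - noverlap
    if step <= 0 or len(y) < nperseg:
        raise ValueError("need noverlap < nperseg <= len(y)")
    window = hann(nperseg)
    starts = range(0, len(y) - nperseg + 1, step)
    spectra = [segment_periodogram(y[s:s + nperseg], window) for s in starts]
    p = len(spectra)  # number of windowed segments
    psd = [sum(column) / p for column in zip(*spectra)]
    freqs = [k * fs / nperseg for k in range(nperseg // 2 + 1)]
    return freqs, psd, p


# Synthetic 4-second signal: a 10 Hz sine sampled at 100 Hz (hypothetical).
fs = 100
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(4 * fs)]
freqs, psd, n_segments = welch(signal, fs, nperseg=100, noverlap=50)
peak_hz = freqs[psd.index(max(psd))]
```

Averaging over segments trades frequency resolution (set by the segment length) for a lower-variance estimate, which is the usual motivation for preferring Welch over the raw periodogram on noisy signals.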
The Multitaper method is used to obtain the power spectral density by carrying the information contained in a signal
into the frequency domain. The power spectrum is created by distributing the average power of a signal over certain
frequency values in the signal. The spectral density of the Multitaper method is calculated by Equation 3, where $K$
represents the number of filters (tapers) to be used, $h_{k}(t)$ is the impulse response of the $k$-th filter [18],
and $N$ and $y(t)$ are the number of samples and the signal, as in Equation 1.


$$\hat{\phi}_{MT}(\omega) = \frac{1}{K}\sum_{k=1}^{K}\left|\sum_{t=1}^{N} h_{k}(t)\, y(t)\, e^{-i\omega t}\right|^{2} \qquad (3)$$
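
A minimal sketch of the multitaper idea is given below: the signal is multiplied by K orthogonal tapers, a periodogram is computed for each tapered copy, and the K spectra are averaged. The text does not state which taper family was used. Slepian (DPSS) tapers are the common choice, but this sketch substitutes sine tapers so that it needs no external library. The function names, K = 4, and the synthetic test signal are likewise assumptions.

```python
import cmath
import math


def sine_tapers(n, k_tapers):
    """Orthonormal sine tapers; a stand-in for the usual DPSS tapers."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * k * (t + 1) / (n + 1)) for t in range(n)]
            for k in range(1, k_tapers + 1)]


def multitaper(y, fs, k_tapers=4):
    """Average the periodograms of K tapered copies of the signal."""
    n = len(y)
    tapers = sine_tapers(n, k_tapers)
    freqs, psd = [], []
    for j in range(n // 2 + 1):
        w = 2.0 * math.pi * j / n
        phasors = [cmath.exp(-1j * w * t) for t in range(n)]
        total = 0.0
        for h in tapers:
            total += abs(sum(h[t] * y[t] * phasors[t] for t in range(n))) ** 2
        freqs.append(j * fs / n)
        psd.append(total / k_tapers)
    return freqs, psd


# Synthetic 2-second signal: a 10 Hz sine sampled at 100 Hz (hypothetical).
fs = 100
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]
freqs, psd = multitaper(signal, fs, k_tapers=4)
peak_hz = freqs[psd.index(max(psd))]
tapers = sine_tapers(len(signal), 4)
```

Because each taper spreads a spectral line over a small band, the peak of the averaged spectrum lies near, but not necessarily exactly at, the true frequency; increasing K lowers the variance at the cost of a wider band.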


3. Experimental Results


In this study, feature vectors were created by calculating the power spectral densities of the EEG signals in the
experimental EEG dataset with three different spectral analysis methods. The performance of the classification
models created with these feature vectors and the SVM machine learning algorithm was compared. The power spectral
densities of the frequencies between 1-49 Hz of the EEG signals recorded from 36 participants were calculated using
the Periodogram, Welch and Multitaper spectral analysis methods. The raw 10-second EEG signals of the mental
workload and resting states from the Fp1 channel are given in Figure 3. Raw EEG signals from other channels have
similar characteristics.


Figure 3. Raw EEG signal graph


The power spectral density values of the mental workload and resting states obtained from the Fp1 channel using the
Periodogram method are given in Figure 4. Power spectral density values obtained from other channels have similar
characteristics.


Figure 4. Power spectral density graph calculated by the Periodogram


In the Welch method, the window length was selected as 4 of the data size and the “Noverlap” parameter as 4 of the
window length. The power spectral density values of the mental workload and resting states obtained from the Fp1
channel using the Welch method are given in Figure 5.


Figure 5. Power spectral density plot calculated by the Welch


The power spectral density values obtained from the Fp1 channel using the Multitaper method are given in Figure 6.


Figure 6. Power spectral density plot calculated by the Multitaper


In the study, three different experiments were carried out using the Periodogram, Welch and Multitaper methods. In
all experiments, the SVM machine learning algorithm was used together with these methods. The experimental EEG
dataset consists of EEG signals from 36 subjects and 21 channels, recorded during resting and mental workload tasks
with a sampling frequency of 500 Hz. The number of feature vectors obtained from these data (36 subjects x 21
channels x 2 conditions = 1512) was increased by using the amplifying augmentation method. The primary goal of data
augmentation is to provide enough data for the model to make more accurate predictions. Data augmentation also
helps to reduce overfitting of the model. Using the data augmentation method, the feature vectors were multiplied by
the factors 0.98 and 1.02, so the number of feature vectors was increased threefold. These factors represent
variations in the amplitude of brain waves that depend on factors such as electrode-tissue impedance [19-20]. As a
result, the augmented dataset contains 4536 samples (1512 x 3) in total.
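
The threefold increase described above can be sketched as follows. It is a sketch under assumptions: it takes the scaling to be applied to the feature values themselves, and the function name and the placeholder feature vectors (49 values each) are hypothetical.

```python
def augment_by_scaling(feature_vectors, factors=(0.98, 1.02)):
    """Return the original vectors plus one scaled copy per factor."""
    augmented = [list(vector) for vector in feature_vectors]
    for factor in factors:
        for vector in feature_vectors:
            augmented.append([factor * value for value in vector])
    return augmented


# Placeholder data: 1512 feature vectors (36 subjects x 21 channels x 2 states).
base = [[1.0 + i] * 49 for i in range(1512)]
augmented = augment_by_scaling(base)
```

The originals come first, followed by the 0.98-scaled and then the 1.02-scaled copies, giving 1512 x 3 = 4536 vectors. Scaled copies of the same recording are strongly correlated, so a split made after augmentation can place near-duplicates of one recording in both the training and the test set.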


In order to assess the performance of the classification models, these data were divided into a 2/3 training dataset
(3024) and a 1/3 test dataset (1512) using the holdout method. The model was trained with the training dataset, and
the success of the model was checked with the 1512 independent test samples. The "rbf" kernel function was chosen in
the SVM algorithm. The confusion matrix parameters of the experiments in which different spectral analysis methods
were used are given in Table 1.


Table 1. The parameters in the confusion matrix of the experiments in which different spectral analysis methods were used


             |             |                | Confusion matrix parameters
Experiments  | Method      | Classification | TP  | FP | FN | TN  | FP+FN | TP+TN
Experiment 1 | Periodogram | SVM            | 725 | 53 | 68 | 666 | 121   | 1391
Experiment 2 | Welch       | SVM            | 731 | 8  | 50 | 723 | 58    | 1454
Experiment 3 | Multitaper  | SVM            | 745 | 2  | 85 | 680 | 87    | 1425


When Experiment 1, Experiment 2 and Experiment 3 are examined in Table 1, the highest number of correctly
classified samples belongs to Experiment 2, in which the feature vectors extracted by the Welch method are used with
the SVM machine learning algorithm. In the confusion matrix of Experiment 2, the TP value is 731, the FP value is 8,
the FN value is 50, the TN value is 723, the total number of incorrectly classified samples (FP+FN) is 58, and the
total number of correctly classified samples (TP+TN) is 1454.


Using these parameters obtained from the confusion matrix, the sensitivity, specificity, precision, F1-score, MCC
and accuracy model performance criteria are calculated. The performance of the models is evaluated using these
performance criteria. The performance results of the classification models in the experiments are given in
Table 2.


Table 2. Performance analysis of classification models


             |                       | Model performance evaluation
Experiments  | Classification models | Sensitivity | Specificity | Precision | F1-score | MCC   | Accuracy
Experiment 1 | Periodogram + SVM     | 0.914       | 0.926       | 0.931     | 0.923    | 0.839 | 92.30%
Experiment 2 | Welch + SVM           | 0.936       | 0.989       | 0.989     | 0.961    | 0.924 | 96.16%
Experiment 3 | Multitaper + SVM      | 0.897       | 0.997       | 0.997     | 0.944    | 0.890 | 94.48%
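
The performance criteria listed in Table 2 follow from the confusion matrix counts by their standard definitions. The sketch below applies those definitions to the Experiment 2 counts from Table 1; the function name and dictionary keys are hypothetical.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "accuracy": accuracy,
    }


# Experiment 2 (Welch + SVM) counts taken from Table 1.
welch_svm = confusion_metrics(tp=731, fp=8, fn=50, tn=723)
```

For the Experiment 2 counts, the computed values agree with the Welch + SVM row of Table 2 to within about 0.001.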


When the performance analyses of the classification models are examined in Table 2, the highest performance
belongs to the Welch and SVM classification model, in which the power spectral density values obtained by the
Welch method are used. In Experiment 2, the performance analysis results of the Welch and SVM classification
model were calculated as 0.936 sensitivity, 0.989 specificity, 0.989 precision, 0.961 F1-score, 0.924 MCC, and
96.16% accuracy. The values of the model performance criteria should be close to 1. Values close to 1 indicate
that the success of the model is not due to chance [21]. In Figure 7, the performance analysis results of the SVM
algorithm according to the spectral analysis methods are given.


Figure 7. Performance results of classification algorithms according to spectral analysis methods


Of the three spectral analysis methods whose performance results were compared (Periodogram, Welch and
Multitaper), the Welch method showed the highest accuracy on this EEG dataset. The sensitivity, F1-score, MCC and
accuracy values of the Welch and SVM classification model were higher than those of the other classification
models.


4. Conclusion 


In this study, the performance of the Periodogram, Welch and Multitaper spectral analysis methods on EEG signals was
compared. Three separate experiments were carried out using the power spectral densities calculated with the
Periodogram, Multitaper and Welch spectral analysis methods. In the experiments, feature vectors were first
obtained by calculating the power spectral densities of the EEG signals between 1-49 Hz with these spectral
analysis methods. The results obtained using these feature vectors and the SVM algorithm were analyzed according
to the model performance criteria. According to the experimental results, the classification model constructed
with the Welch spectral analysis method and the SVM algorithm has the best performance. The performance analysis
results of the Welch and SVM classification model were calculated as 0.936 sensitivity, 0.989 specificity, 0.989
precision, 0.961 F1-score, 0.924 MCC, and 96.16% accuracy. The fact that these values are close to 1 indicates that
the success of the model is not due to chance. It is thought that the proposed model will support classification
studies and show high performance when applied to different EEG datasets.